EASY ENSEMBLE WITH RANDOM FOREST TO HANDLE IMBALANCED DATA IN CLASSIFICATION
Authors
Abstract
Similar resources
Using Random Forest to Learn Imbalanced Data
In this paper we propose two ways to deal with the imbalanced data classification problem using random forest. One is based on cost sensitive learning, and the other is based on a sampling technique. Performance metrics such as precision and recall, false positive rate and false negative rate, F-measure and weighted accuracy are computed. Both methods are shown to improve the prediction accurac...
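The metrics named in this abstract can all be derived from the four confusion-matrix counts. The following is a minimal sketch, not the paper's code: the function name `imbalance_metrics` and its parameters are hypothetical, and weighted accuracy is assumed here to mean `beta * TPR + (1 - beta) * TNR` with `beta = 0.5` as an illustrative default.

```python
def imbalance_metrics(y_true, y_pred, positive=1, beta=0.5):
    """Compute common imbalanced-classification metrics from label lists.

    Assumption: weighted accuracy = beta * TPR + (1 - beta) * TNR.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    def ratio(num, den):
        # Return 0.0 instead of raising when a class is absent.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)   # true positive rate
    tnr = ratio(tn, tn + fp)      # true negative rate
    f_measure = ratio(2 * precision * recall, precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "false_positive_rate": ratio(fp, fp + tn),
        "false_negative_rate": ratio(fn, fn + tp),
        "f_measure": f_measure,
        "weighted_accuracy": beta * recall + (1 - beta) * tnr,
    }
```

Plain accuracy would reward a classifier that always predicts the majority class, which is why per-class rates such as these are usually reported for imbalanced data.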
Random Forest Based Imbalanced Data Cleaning and Classification
The given task of the PAKDD 2007 data mining competition is a typical problem of learning from an extremely imbalanced data set. In this paper, we propose a combination of random forest based techniques and sampling methods to identify the potential buyers. Our method is mainly composed of two phases: data cleaning and classification, both based on random forest. Firstly, the data set is cleaned by t...
A Novel Approach to Handle Imbalanced Data for Classification
This paper proposes a particle swarm K-means optimization (PSKO)-based granular computing (GrC) model that preprocesses the skewed class distribution in order to enhance classification accuracy for the class imbalance problem. The GrC model acquires knowledge from information granules rather than from numerical data. It also processes multi-dimensional and sparse data by using singular v...
Classification of Imbalanced Marketing Data with Balanced Random Sets
With imbalanced data, a classifier built using all of the data has a tendency to ignore the minority class. To overcome this problem, we propose to use an ensemble classifier constructed on the basis of a large number of relatively small and balanced subsets, where representatives from both patterns are to be selected randomly. As an outcome, the system produces the matrix of linear regressio...
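The idea of many small, balanced random subsets can be sketched as follows. This is an illustration under stated assumptions, not the authors' system: the base learner is a nearest-centroid classifier chosen only to keep the example dependency-free, predictions are combined by majority vote, and all function names and parameters (`fit_balanced_ensemble`, `n_subsets`, `subset_size`, `seed`) are hypothetical.

```python
import random
from collections import Counter


def fit_centroids(X, y):
    """Stand-in base learner: one mean vector (centroid) per class."""
    rows_by_class = {}
    for row, label in zip(X, y):
        rows_by_class.setdefault(label, []).append(row)
    return {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in rows_by_class.items()
    }


def predict_centroids(model, row):
    """Return the class whose centroid is closest to the row."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(row, center))
    return min(model, key=lambda label: sq_dist(model[label]))


def fit_balanced_ensemble(X, y, n_subsets=25, subset_size=None, seed=0):
    """Train one base learner per randomly drawn balanced subset."""
    rng = random.Random(seed)
    idx_by_class = {}
    for i, label in enumerate(y):
        idx_by_class.setdefault(label, []).append(i)
    minority = min(len(idx) for idx in idx_by_class.values())
    # Draw the same number of rows from every class, capped by the minority size.
    k = min(subset_size or minority, minority)
    models = []
    for _ in range(n_subsets):
        chosen = []
        for indices in idx_by_class.values():
            chosen.extend(rng.sample(indices, k))
        models.append(
            fit_centroids([X[i] for i in chosen], [y[i] for i in chosen])
        )
    return models


def predict_ensemble(models, row):
    """Combine the base learners by majority vote."""
    votes = Counter(predict_centroids(m, row) for m in models)
    return votes.most_common(1)[0][0]
```

Because every subset contains equally many rows from each class, no single base learner can profit from ignoring the minority class, while the large number of subsets lets most of the majority-class data contribute to the ensemble.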
A Feature Selection Method to Handle Imbalanced Data in Text Classification
The imbalanced data problem is often encountered in applications of text classification. Feature selection, which can reduce the dimensionality of the feature space and improve the performance of the classifier, is widely used in text classification. This paper presents a new feature selection method named NFS, which selects class information words rather than terms with high document frequency. To im...
Journal
Journal title: Journal of Fundamental Mathematics and Applications (JFMA)
Year: 2020
ISSN: 2621-6035,2621-6019
DOI: 10.14710/jfma.v3i1.7415